35 research outputs found

    Energy-Efficient Instruction Set Synthesis for Application-Specific Processors

    No full text
    Several techniques have been proposed to enhance the energy-efficiency of ASIPs (Application-Specific Instruction set Processors). While those techniques can reduce the energy consumption with a minimal change in the instruction set (IS), they fail to exploit the opportunity of designing the entire IS from the energy-efficiency perspective. In this paper, we present an energy-efficient IS synthesis technique that can comprehensively reduce the energy-delay product (EDP) of ASIPs through optimal instruction encoding, considering both the instruction bitwidth and the dynamic instruction count. Experimental results with a typical embedded RISC processor show that our technique can generate application-specific IS's that are up to 40% more energy-efficient over the native IS for several application benchmarks

    Compilation approach for coarse-grained reconfigurable architectures

    No full text
    Coarse-grained reconfigurable architectures can enhance the performance of critical loops and computation-intensive functions. Such architectures need efficient compilation techniques to map algorithms onto customized architectural configurations. A new compilation approach uses a generic reconfigurable architecture to tackle the memory bottleneck that typically limits the performance of many applications.close396

    Evaluating memory architectures for media applications on coarse-grained reconfigurable architectures

    No full text
    Reconfigurable ALU Array (RAA) architectures - representing a popular class of Coarse-grained Reconfigurable Architectures - are gaining in popularity especially for media applications due to their flexibility, regularity, and efficiency. In such architectures, memory is critical not only for configuration data but also for the heavy data traffic required by the application. In this paper, we offer a scheme for system designers to quickly estimate the performance of media applications on RAA architectures. Our experimental results demonstrate the flexibility of our memory architecture evaluation scheme as well as the varying effects of the memory architectures on the application performance.close0

    Exploiting Heterogeneous Mobile Architectures Through a Unified Runtime Framework

    No full text
    International audienceModern mobile SoCs are typically integrated with multiple heterogeneous hardware accelerators such as GPU and DSP. Resource heavy applications such as object detection and image recognition based on convolutional neural networks are accelerated by offloading these computation-intensive algorithms to the accelerators to meet their stringent performance constraints. Conventionally there are device-specific runtime and programming languages supported for programming each accelerator, and these offloading tasks are typically pre-mapped to a specific compute unit at compile time, missing the opportunity to exploit other underutilized compute resources to gain better performance. To address this shortcoming, we present SURF: a Self-aware Unified Runtime Framework for Parallel Programs on Heterogeneous Mobile Architectures. SURF supports several heterogeneous parallel programming languages (including OpenMP and OpenCL), and enables dynamic task-mapping to heterogeneous resources based on runtime measurement and prediction. The measurement and monitoring loop enables self-aware adaptation of run-time mapping to exploit the best available resource dynamically. Our SURF framework has been implemented on a Qualcomm Snapdragon 835 development board and evaluated on a mix of image recognition (CNN), image filtering applications and synthetic benchmarks to demonstrate the versatility and efficacy of our unified runtime framework

    Test-case generation for embedded simulink via formal concept analysis

    No full text
    Mutation testing suffers from the high computational cost of automated test-vector generation, due to the large number of mutants that can be derived from programs and the cost of generating test-cases in a white-box manner. We propose a novel algorithm for mutation-based test-case generation for Simulink models that combines white-box testing with formal concept analysis. By exploiting similarity measures on mutants, we are able to effectively generate small sets of short test-cases that achieve high coverage on a collection of Simulink models from the automotive domain. Experiments show that our algorithm performs significantly better than random testing or simpler mutation-testing approaches

    Unsupervised heart-rate estimation in wearables with liquid states and a probabilistic readout

    No full text
    \u3cp\u3eHeart-rate estimation is a fundamental feature of modern wearable devices. In this paper we propose a machine learning technique to estimate heart-rate from electrocardiogram (ECG) data collected using wearable devices. The novelty of our approach lies in (1) encoding spatio-temporal properties of ECG signals directly into spike train and using this to excite recurrently connected spiking neurons in a Liquid State Machine computation model; (2) a novel learning algorithm; and (3) an intelligently designed unsupervised readout based on Fuzzy c-Means clustering of spike responses from a subset of neurons (Liquid states), selected using particle swarm optimization. Our approach differs from existing works by learning directly from ECG signals (allowing personalization), without requiring costly data annotations. Additionally, our approach can be easily implemented on state-of-the-art spiking-based neuromorphic systems, offering high accuracy, yet significantly low energy footprint, leading to an extended battery-life of wearable devices. We validated our approach with CARLsim, a GPU accelerated spiking neural network simulator modeling Izhikevich spiking neurons with Spike Timing Dependent Plasticity (STDP) and homeostatic scaling. A range of subjects is considered from in-house clinical trials and public ECG databases. Results show high accuracy and low energy footprint in heart-rate estimation across subjects with and without cardiac irregularities, signifying the strong potential of this approach to be integrated in future wearable devices.\u3c/p\u3
    corecore